1National Key Laboratory of Multispectral Information Intelligent Processing Technology, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, 430074, China
2PhenoTrait Technology Co., Ltd., Beijing, 100096, China
3State Key Laboratory of Plant Cell and Chromosome Engineering, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China
4MetaPheno Laboratory, Shanghai, 201114, China
5SpeCloud Technology Co., Ltd., Sanya, 572025, China
6These authors contributed equally to this work. Work done during internship at PhenoTrait Technology Co., Ltd.
7In this paper, we slightly abuse the term ‘crop’: we do not discriminate between crops and weeds, and instead consider general greenness extraction.
Received 04 Sep 2024 | Accepted 10 Dec 2024 | Published 27 Feb 2025
We present Depth-Informed Crop Segmentation (DepthCropSeg), an almost unsupervised crop segmentation approach that requires no manual pixel-level annotations. Crop segmentation is a fundamental vision task in agriculture that benefits many downstream applications, such as crop growth monitoring and yield estimation. Over the past decade, image-based crop segmentation has shifted from classic color-based paradigms to deep learning-based ones. The latter, however, rely heavily on large amounts of data with high-quality manual annotation, demanding considerable human labor and time. In this work, we leverage Depth Anything V2, a vision foundation model, to produce high-quality pseudo crop masks for training segmentation models. We compile a dataset of 17,199 images from six public plant segmentation sources and generate pseudo masks by normalizing and thresholding the predicted depth maps. After a coarse-to-fine manual screening, 1,378 images with reliable masks are selected. We compare four semantic segmentation models and enhance the top-performing one with depth-informed two-stage self-training and depth-informed post-processing. To evaluate the feasibility and robustness of DepthCropSeg, we benchmark segmentation performance on 10 public crop segmentation testing sets and a self-collected dataset covering in-field, laboratory, and unmanned aerial vehicle (UAV) scenarios. Experimental results show that DepthCropSeg achieves crop segmentation performance comparable to a fully supervised model trained with manually annotated data (86.91 vs. 87.10). To our knowledge, this is the first successful demonstration of almost unsupervised crop segmentation approaching fully supervised performance.
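The pseudo-mask generation step described above (normalize the depth map, then threshold it into a binary crop mask) can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: the function name, the min-max normalization, and the 0.5 cut-off are illustrative assumptions, and a toy array stands in for a Depth Anything V2 prediction.

```python
import numpy as np

def depth_to_pseudo_mask(depth: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Min-max normalize a relative depth map to [0, 1], then threshold it
    into a binary foreground (crop) pseudo mask.

    NOTE: the 0.5 threshold is an illustrative choice, not the paper's
    actual parameter.
    """
    d_min, d_max = depth.min(), depth.max()
    norm = (depth - d_min) / (d_max - d_min + 1e-8)  # scale to [0, 1]
    return (norm > threshold).astype(np.uint8)       # 1 = crop, 0 = background

# Toy example: plant pixels closer to the camera get larger relative depth.
depth = np.array([[0.9, 0.8, 0.1],
                  [0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.1]])
mask = depth_to_pseudo_mask(depth)
```

In practice the input would be the relative depth map predicted by the foundation model for each image, and the resulting masks would then go through the manual screening described above.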